Add API to sort array of custom objects #103

r-devulap · 2023-11-15T20:46:54Z

Benchmarks for sorting an array of 2D and 3D Cartesian coordinates:

Benchmark                                Time             CPU   Iterations
--------------------------------------------------------------------------
scalarobjsort<Point2D>/1000          54107 ns        54100 ns        12908
scalarobjsort<Point2D>/10000       1172135 ns      1172105 ns          597
scalarobjsort<Point2D>/100000     14652163 ns     14651279 ns           48
scalarobjsort<Point2D>/1000000   174384347 ns    174363797 ns            4
scalarobjsort<Point2D>/10000000 2042991245 ns   2042818194 ns            1
scalarobjsort<Point3D>/1000          54005 ns        53998 ns        12886
scalarobjsort<Point3D>/10000       1240230 ns      1240178 ns          564
scalarobjsort<Point3D>/100000     15452639 ns     15451391 ns           45
scalarobjsort<Point3D>/1000000   188752147 ns    188723385 ns            4
scalarobjsort<Point3D>/10000000 2182026309 ns   2181807823 ns            1
RUNNING: ./benchexe --benchmark_filter=simdobjsort.* --benchmark_out=/tmp/tmpw4_wn2mr
2023-11-15T12:55:24-08:00
Running ./benchexe
Run on (20 X 1258 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x10)
  L1 Instruction 32 KiB (x10)
  L2 Unified 1024 KiB (x10)
  L3 Unified 14080 KiB (x1)
Load Average: 0.72, 0.25, 0.17
------------------------------------------------------------------------
Benchmark                              Time             CPU   Iterations
------------------------------------------------------------------------
simdobjsort<Point2D>/1000          14763 ns        14775 ns        47513
simdobjsort<Point2D>/10000        199342 ns       199357 ns         3474
simdobjsort<Point2D>/100000      3887458 ns      3887530 ns          181
simdobjsort<Point2D>/1000000    97432099 ns     97428446 ns            6
simdobjsort<Point2D>/10000000 2243062153 ns   2242956708 ns            1
simdobjsort<Point3D>/1000          16229 ns        16238 ns        43099
simdobjsort<Point3D>/10000        213301 ns       213314 ns         3274
simdobjsort<Point3D>/100000      4175434 ns      4175449 ns          168
simdobjsort<Point3D>/1000000   105489606 ns    105483514 ns            6
simdobjsort<Point3D>/10000000 2305455681 ns   2305295989 ns            1
Comparing scalarobjsort.* to simdobjsort.* (from ./benchexe)
Benchmark                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------
[scalarobjsort.* vs. simdobjsort.*]                -0.7272         -0.7269         54107         14763         54100         14775
[scalarobjsort.* vs. simdobjsort.*]                -0.8299         -0.8299       1172135        199342       1172105        199357
[scalarobjsort.* vs. simdobjsort.*]                -0.7347         -0.7347      14652163       3887458      14651279       3887530
[scalarobjsort.* vs. simdobjsort.*]                -0.4413         -0.4412     174384347      97432099     174363797      97428446
[scalarobjsort.* vs. simdobjsort.*]                +0.0979         +0.0980    2042991245    2243062153    2042818194    2242956708
[scalarobjsort.* vs. simdobjsort.*]                -0.6995         -0.6993         54005         16229         53998         16238
[scalarobjsort.* vs. simdobjsort.*]                -0.8280         -0.8280       1240230        213301       1240178        213314
[scalarobjsort.* vs. simdobjsort.*]                -0.7298         -0.7298      15452639       4175434      15451391       4175449
[scalarobjsort.* vs. simdobjsort.*]                -0.4411         -0.4411     188752147     105489606     188723385     105483514
[scalarobjsort.* vs. simdobjsort.*]                +0.0566         +0.0566    2182026309    2305455681    2181807823    2305295989
[scalarobjsort.* vs. simdobjsort.*]_pvalue          0.6776          0.6776      U Test, Repetitions: 10 vs 10
OVERALL_GEOMEAN                                    -0.6203         -0.6202             0             0             0             0
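
For readers unfamiliar with the comparison table: the Time and CPU columns appear to be the relative change (new - old) / old, so -0.7272 means simdobjsort took about 73% less time than scalarobjsort, and the positive values for the 10,000,000-element rows mean it was slower there. The sketch below reproduces that arithmetic from the Time Old / Time New columns. The function names `relative_change` and `geomean_change` are made up for illustration and are not part of the benchmark tooling; the geometric-mean formula is an assumption that happens to be consistent with the OVERALL_GEOMEAN row.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Relative change of new_time with respect to old_time.
// Negative values mean the new run was faster.
inline double relative_change(double old_time, double new_time)
{
    return (new_time - old_time) / old_time;
}

// Geometric mean of the new/old ratios, expressed as a relative change.
// Assumes both vectors are non-empty and have the same length.
inline double geomean_change(const std::vector<double> &old_times,
                             const std::vector<double> &new_times)
{
    double log_sum = 0.0;
    for (std::size_t i = 0; i < old_times.size(); ++i)
        log_sum += std::log(new_times[i] / old_times[i]);
    return std::exp(log_sum / static_cast<double>(old_times.size())) - 1.0;
}
```

For example, `relative_change(54107, 14763)` evaluates to roughly -0.727, matching the first row of the table.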

thiagomacieira · 2023-11-15T21:12:17Z

lib/x86simdsort.h

 #define UNUSED(x) (void)(x)

+template <typename T>
+XSS_HIDE_SYMBOL void permute_array_in_place(T *A, std::vector<size_t> P)


Move to an x86simdsort::detail namespace.

Take the P parameter by const-reference or change the call site to use std::move(arg).
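
A minimal sketch of what this suggestion could look like, assuming the permutation is applied by following cycles and that `P[i]` is the index of the element that should end up at position `i`. Because that approach overwrites `P` as it goes, the sketch keeps the by-value parameter and moves the vector in at the call site. The namespace `xss_sketch::detail` and the wrapper `apply_sorted_order` are stand-ins invented for this example, not names from the PR.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

namespace xss_sketch {
namespace detail {

// Rearranges A so that new A[i] == old A[P[i]].
// P is taken by value because it is used as scratch space to mark
// positions that are already in place.
template <typename T>
void permute_array_in_place(T *A, std::vector<std::size_t> P)
{
    for (std::size_t i = 0; i < P.size(); ++i) {
        std::size_t cur = i;
        // Walk the cycle starting at i; the displaced element travels along.
        while (P[cur] != i) {
            const std::size_t next = P[cur];
            std::swap(A[cur], A[next]);
            P[cur] = cur; // mark as done
            cur = next;
        }
        P[cur] = cur;
    }
}

} // namespace detail

// Hypothetical call site: the index vector is no longer needed afterwards,
// so it is moved instead of copied.
template <typename T>
void apply_sorted_order(T *arr, std::vector<std::size_t> arg)
{
    detail::permute_array_in_place(arr, std::move(arg));
}

} // namespace xss_sketch
```

With a const-reference parameter instead, the function would need its own scratch storage (for example a `std::vector<bool>` of visited flags), so which option is cheaper depends on whether the caller still needs the index vector.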

thiagomacieira · 2023-11-15T21:13:51Z

lib/x86simdsort.h

+    using return_type_of =
+            typename decltype(std::function {key_func})::result_type;


This relies on class template argument deduction (CTAD) for std::function, which is a C++17 feature. If you need to support pre-C++17, you will need to rewrite it.
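
One possible pre-C++17 rewrite, shown only as a sketch: deduce the key type from the call expression itself with `decltype` and `std::declval`, which needs nothing newer than C++11. The alias name `key_type_of` is an assumption made for this example and does not appear in the PR.

```cpp
#include <type_traits>
#include <utility>

// Type produced by calling an F with an lvalue of type T, with references
// and cv-qualifiers stripped so it can be stored in a std::vector.
template <typename T, typename F>
using key_type_of = typename std::decay<
        decltype(std::declval<F &>()(std::declval<T &>()))>::type;
```

Inside the function under review this would be used as `using return_type_of = key_type_of<T, F>;`. A side effect of this form is that it also accepts callables with an overloaded call operator, for which the `std::function` deduction guide cannot pick a signature.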

thiagomacieira · 2023-11-15T21:14:21Z

lib/x86simdsort.h

+{
+    using return_type_of =
+            typename decltype(std::function {key_func})::result_type;
+    std::vector<return_type_of> keys;


Add: keys.reserve(arrsize).
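
To illustrate why the `reserve` call matters, here is a small sketch that counts how often a vector's capacity changes while elements are appended, with and without reserving first. The helper `count_capacity_changes` is invented for this example. The exact count without `reserve` depends on the standard library's growth policy, so no specific number is claimed for that case.

```cpp
#include <cstddef>
#include <vector>

// Appends n elements and returns how many times the capacity changed
// during the appends (each change implies a reallocation and a copy/move
// of the elements already stored).
inline std::size_t count_capacity_changes(std::size_t n, bool reserve_first)
{
    std::vector<double> keys;
    if (reserve_first) keys.reserve(n);

    std::size_t changes = 0;
    std::size_t last_capacity = keys.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        keys.emplace_back(static_cast<double>(i));
        if (keys.capacity() != last_capacity) {
            ++changes;
            last_capacity = keys.capacity();
        }
    }
    return changes;
}
```

With `reserve_first` set, the function returns 0 because the vector never has to grow while the keys are appended.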

thiagomacieira · 2023-11-15T21:22:27Z

lib/x86simdsort.h


+// sort an object
+template <typename T, typename F>
+XSS_EXPORT_SYMBOL void object_qsort(T *arr, size_t arrsize, const F key_func)


This would be more idiomatic in C++ if you did:

template <typename It, typename F>
void object_qsort(It begin, It end, F &&key_func)
{
    using T = typename std::iterator_traits<It>::value_type;
#if __cplusplus >= 201703L
    using R = std::invoke_result_t<F, T>;
#else
    using R = std::result_of_t<F(T)>;
#endif
    std::vector<R> keys;
    keys.reserve(std::distance(begin, end));
    for (auto it = begin; it != end; ++it)
        keys.emplace_back(key_func(*it));
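
The snippet in the comment above stops after the keys are collected. For context, the following is a self-contained sketch of how an iterator-based interface along these lines could be completed. It is not the implementation from this PR: `std::stable_sort` on an index vector stands in for the library's SIMD sort, the name `object_sort_sketch` is hypothetical, and the code assumes random-access iterators and a movable value type.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <type_traits>
#include <utility>
#include <vector>

template <typename It, typename F>
void object_sort_sketch(It begin, It end, F &&key_func)
{
    using T = typename std::iterator_traits<It>::value_type;
    using R = typename std::decay<decltype(key_func(*begin))>::type;

    // 1. Extract one key per object.
    const auto n = static_cast<std::size_t>(std::distance(begin, end));
    std::vector<R> keys;
    keys.reserve(n);
    for (auto it = begin; it != end; ++it)
        keys.emplace_back(key_func(*it));

    // 2. Sort indices by key (placeholder for a vectorized sort).
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&keys](std::size_t a, std::size_t b) {
                         return keys[a] < keys[b];
                     });

    // 3. Rearrange the objects to match the sorted key order.
    std::vector<T> sorted;
    sorted.reserve(n);
    for (std::size_t idx : order)
        sorted.emplace_back(std::move(begin[idx]));
    std::move(sorted.begin(), sorted.end(), begin);
}
```

A call such as `object_sort_sketch(pts.begin(), pts.end(), [](const Point2D &p) { return p.x * p.x + p.y * p.y; })` would order points by squared distance from the origin, where `Point2D` is assumed to be a simple struct with `x` and `y` members. Step 3 uses a temporary buffer for clarity; an in-place permutation would avoid the extra allocation.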

thiagomacieira reviewed Nov 15, 2023


Raghuveer Devulapalli added 6 commits November 29, 2023 12:29

Add method to sort array/vector of custom objects

7c42a0d

Add benchmarks for objsort

8f8dd4a

Use the permute array in-line

38fbbc8

Use distance

0d6ffa9

Use key-value sort instead of argsort

9a92ab0

Add more distance metrics

fbc033e

r-devulap force-pushed the customsort-expt branch from e680504 to fbc033e Compare November 30, 2023 20:55

r-devulap merged commit d9c9737 into numpy:main Nov 30, 2023
